Handling Missing Attribute Values
Authors
Abstract
This chapter describes methods of handling missing attribute values in data mining. These methods are categorized as either sequential or parallel. In sequential methods, missing attribute values are first replaced by known values as a preprocessing step, and knowledge is then acquired from a data set in which all attribute values are known. In parallel methods, there is no preprocessing, i.e., knowledge is acquired directly from the original data sets. The main emphasis of the chapter is on rule induction. Methods of handling missing attribute values for decision tree generation are only briefly summarized.
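As a rough illustration of the sequential idea, the sketch below fills each missing value with the most common known value of the same attribute before any learning step would run. The choice of most-common-value replacement, the "?" marker, the function name fill_most_common, and the toy data are all assumptions made for this example; the abstract does not specify which replacement methods the chapter covers.

```python
from collections import Counter

MISSING = "?"  # assumed marker for a missing attribute value


def fill_most_common(rows, missing=MISSING):
    """Return a copy of rows in which each missing value is replaced by
    the most common known value of the same attribute (column)."""
    if not rows:
        return []
    n_cols = len(rows[0])
    fill = []
    for c in range(n_cols):
        known = [r[c] for r in rows if r[c] != missing]
        # If every value in the column is missing, leave the marker in place.
        fill.append(Counter(known).most_common(1)[0][0] if known else missing)
    return [[fill[c] if v == missing else v for c, v in enumerate(r)]
            for r in rows]


# Hypothetical toy data: attributes are (temperature, headache).
cases = [
    ["high", "yes"],
    ["?", "yes"],
    ["high", "?"],
    ["normal", "no"],
]
completed = fill_most_common(cases)
```

A parallel method would skip this step and induce rules from cases directly, treating "?" as a special value during learning.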
Similar resources
A comparison of traditional and rough set approaches to missing attribute values in data mining
Real-life data sets are often incomplete, i.e., some attribute values are missing. In this paper we compare traditional, frequently used methods of handling missing attribute values, which are based on preprocessing, with another class of methods dealing with missing attribute values in which rule induction is performed directly on incomplete data sets, i.e., handling missing attribute values a...
A Closest Fit Approach to Missing Attribute Values in Preterm Birth Data
Recently, results on a comparison of seven successful methods of handling missing attribute values were reported. This paper describes experimental results on the three most successful methods out of these seven. Two of these methods, based on a Closest Fit idea (searching in a remaining data set for the closest fit case and replacing a missing attribute value by the corresponding known value fr...
Classifying Unseen Cases with Many
Handling missing attribute values is an important issue for classifier learning, since missing attribute values in either training data or test (unseen) data affect the prediction accuracy of learned classifiers. In many real KDD applications, attributes with missing values are very common. This paper studies the robustness of four recently developed committee learning techniques, including Boosti...
Solving Incomplete Datasets in Soft Set Using Supported Sets and Aggregate Values
The theory of soft sets, proposed by Molodtsov in 1999 [1], is a new method for handling uncertain data and can be defined as a Boolean-valued information system. It has been applied to data analysis and decision support systems based on large datasets. In this paper, it is shown that a calculated support value can be used to determine the missing attribute value of an object. However, in cases when more th...